#feature engineering · 01/11/2025
From Colab to Production: Build an End-to-End Spark + PySpark Pipeline
Hands-on guide to run PySpark in Colab, perform ETL, run SQL and window functions, train a logistic regression model, and save results to Parquet.
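To make the scope of the guide concrete, here is a minimal sketch that strings those steps together end to end. It assumes a local SparkSession (what Colab gives you after installing pyspark), a tiny in-memory dataset, and illustrative column names and output path; treat it as an outline of the flow rather than the article's exact code.

```python
# Sketch of the pipeline: ETL -> SQL + window function -> logistic regression -> Parquet.
# All data, column names, and the output path are placeholders.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("colab-pipeline").getOrCreate()

# ETL: build a toy DataFrame and derive an extra feature column.
df = spark.createDataFrame(
    [("a", 1, 10.0, 0), ("a", 2, 12.5, 0), ("b", 1, 3.0, 1), ("b", 2, 4.5, 1)],
    ["user", "step", "amount", "label"],
)
df = df.withColumn("amount_log", F.log1p("amount"))

# SQL + window function: rank each user's events and keep only the latest one.
df.createOrReplaceTempView("events")
latest = spark.sql("""
    SELECT * FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY user ORDER BY step DESC) AS rn
        FROM events
    ) WHERE rn = 1
""")

# The same window expressed with the DataFrame API instead of SQL.
w = Window.partitionBy("user").orderBy(F.col("step").desc())
latest_df_api = df.withColumn("rn", F.row_number().over(w)).filter("rn = 1")

# Train a logistic regression model on the assembled features.
assembler = VectorAssembler(inputCols=["amount", "amount_log"], outputCol="features")
train = assembler.transform(latest)
model = LogisticRegression(featuresCol="features", labelCol="label").fit(train)

# Persist predictions to Parquet, dropping the vector columns for readability.
(model.transform(train)
      .drop("features", "rawPrediction", "probability")
      .write.mode("overwrite")
      .parquet("/tmp/predictions.parquet"))

spark.stop()
```

The window function is shown twice (SQL and DataFrame API) because the guide exercises both styles; in practice you would pick one.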